AITopics | correlation detection

Collaborating Authors

correlation detection

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model Correlation Detection via Random Selection Probing

Chen, Ruibo, Zhang, Sheng, Wu, Yihan, Zheng, Tong, Mai, Peihua, Huang, Heng

arXiv.org Artificial IntelligenceSep-30-2025

The growing prevalence of large language models (LLMs) and vision-language models (VLMs) has heightened the need for reliable techniques to determine whether a model has been fine-tuned from or is even identical to another. Existing similarity-based methods often require access to model parameters or produce heuristic scores without principled thresholds, limiting their applicability. We introduce Random Selection Probing (RSP), a hypothesis-testing framework that formulates model correlation detection as a statistical test. RSP optimizes textual or visual prefixes on a reference model for a random selection task and evaluates their transferability to a target model, producing rigorous p-values that quantify evidence of correlation. To mitigate false positives, RSP incorporates an unrelated baseline model to filter out generic, transferable features. We evaluate RSP across both LLMs and VLMs under diverse access conditions for reference models and test models. Experiments on fine-tuned and open-source models show that RSP consistently yields small p-values for related models while maintaining high p-values for unrelated ones. Extensive ablation studies further demonstrate the robustness of RSP. These results establish RSP as the first principled and general statistical framework for model correlation detection, enabling transparent and interpretable decisions in modern machine learning ecosystems.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.24171

Country:

North America > United States (0.46)
North America > Mexico (0.28)
Europe > Austria (0.28)

Genre: Research Report > Experimental Study (0.78)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The graph alignment problem: fundamental limits and efficient algorithms

Ganassali, Luca

arXiv.org Machine LearningApr-18-2024

Similarly to many other inference problems in planted models, we are interested in understanding the fundamental information-theoretical limits as well as the computational hardness of graph alignment. First, we study the Gaussian setting, when the graphs are complete and the signal lies on correlated Gaussian edges weights. We prove that the exact recovery task exhibits a sharp information-theoretic threshold (and characterize it), and study a simple and natural spectral method for recovery, EIG1, which consists in aligning the leading eigenvectors of the adjacency matrices of the two graphs. While most of the recent work on the subject was dedicated to recovering the hidden signal in dense graphs, we next explore graph alignment in the sparse regime, where the mean degree of the nodes are constant, not scaling with the graph size. In this particularly challenging setting, for sparse Erdős-Rényi graphs, only a fraction of the nodes can be correctly matched by any algorithm.

eigenvector, likelihood ratio, section 1, (17 more...)

arXiv.org Machine Learning

2404.12418

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Austria > Vienna (0.13)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
(3 more...)

Add feedback

Statistical limits of correlation detection in trees

Ganassali, Luca, Massoulié, Laurent, Semerjian, Guilhem

arXiv.org Machine LearningNov-8-2023

In this paper we address the problem of testing whether two observed trees $(t,t')$ are sampled either independently or from a joint distribution under which they are correlated. This problem, which we refer to as correlation detection in trees, plays a key role in the study of graph alignment for two correlated random graphs. Motivated by graph alignment, we investigate the conditions of existence of one-sided tests, i.e. tests which have vanishing type I error and non-vanishing power in the limit of large tree depth. For the correlated Galton-Watson model with Poisson offspring of mean $\lambda>0$ and correlation parameter $s \in (0,1)$, we identify a phase transition in the limit of large degrees at $s = \sqrt{\alpha}$, where $\alpha \sim 0.3383$ is Otter's constant. Namely, we prove that no such test exists for $s \leq \sqrt{\alpha}$, and that such a test exists whenever $s > \sqrt{\alpha}$, for $\lambda$ large enough. This result sheds new light on the graph alignment problem in the sparse regime (with $O(1)$ average node degrees) and on the performance of the MPAlign method studied in Ganassali et al. (2021), Piccioli et al. (2021), proving in particular the conjecture of Piccioli et al. (2021) that MPAlign succeeds in the partial recovery task for correlation parameter $s>\sqrt{\alpha}$ provided the average node degree $\lambda$ is large enough. As a byproduct, we identify a new family of orthogonal polynomials for the Poisson-Galton-Watson measure which enjoy remarkable properties. These polynomials may be of independent interest for a variety of problems involving graphs, trees or branching processes, beyond the scope of graph alignment.

artificial intelligence, correlation detection, machine learning, (14 more...)

arXiv.org Machine Learning

2209.13723

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Add feedback

On Correlation Detection and Alignment Recovery of Gaussian Databases

Tamir, Ran

arXiv.org Artificial IntelligenceMay-25-2023

Two fundamental problems in the statistical analysis of Gaussian databases are correlation detection (or testing) and alignment recovery. Correlation detection of two databases is basically a hypothesis testing problem; under the null hypothesis, the databases are independent, and under the alternate hypothesis, there exists a permutation, for which the databases are correlated. In this task, the main objective is to attain the best trade-off between the type-I and type-II error probabilities. In the problem of database alignment recovery, we make an assumption that the two databases are correlated, and want to estimate the underlying permutation. The objective is to minimize the probability of alignment error. While alignment recovery of databases with n sequences, each containing d Gaussian entries has been recently studied in [1] and correlation detection of such Gaussian databases has been lately explored in [2], it seems very natural to tackle the two individual problems together as a joint problem of correlation detection and alignment recovery. In addition, we also refer to the problem of partial alignment recovery, in which one would like to estimate only part of the underlying alignment between the databases. The main reasons for preferring partial alignment recovery instead of full alignment recovery are as follows.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.01069

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Data Science > Data Mining (0.93)

Add feedback